Stopping Criterion for Boosting-Based Data Reduction Techniques: from Binary to Multiclass Problems
Authors
Abstract
So far, boosting has been used to improve the quality of moderately accurate learning algorithms, by weighting and combining many of their weak hypotheses into a final classifier with theoretically high accuracy. In a recent work (Sebban, Nock and Lallich, 2001), we have attempted to adapt boosting properties to data reduction techniques. In this particular context, the objective was not only to improve the success rate, but also to reduce the time and space complexities due to the storage requirements of some costly learning algorithms, such as nearest-neighbor classifiers. In that framework, each weak hypothesis, which is usually built and weighted from the learning set, is replaced by a single learning instance. The weight given by boosting defines in that case the relevance of the instance, and a statistical test allows one to decide whether it can be discarded without damaging further classification tasks. In Sebban, Nock and Lallich (2001), we addressed problems with two classes. It is the aim of the present paper to relax the class constraint, and extend our contribution to multiclass problems. Beyond data reduction, experimental results are also provided on twenty-three datasets, showing the benefits that our boosting-derived weighting rule brings to weighted nearest neighbor classifiers.
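As a rough illustration of the two ideas in the abstract (per-instance weights used as votes in a nearest-neighbor classifier, and removal of instances judged irrelevant), the following minimal Python sketch may help. It is not the authors' method: the abstract does not give the boosting-derived weighting rule or the statistical test, so the weights are supplied by hand, assumed non-negative, and a plain threshold stands in for the test. The function names `weighted_knn_predict` and `reduce_training_set` are hypothetical.

```python
import math
from collections import defaultdict


def weighted_knn_predict(train, weights, query, k=3):
    """Predict a label by weighted vote among the k nearest stored instances.

    train   : list of (feature_tuple, label) pairs
    weights : one non-negative relevance weight per training instance
              (placeholder values; the paper derives them from boosting)
    """
    # Rank stored instances by Euclidean distance to the query.
    ranked = sorted(range(len(train)),
                    key=lambda i: math.dist(train[i][0], query))
    votes = defaultdict(float)
    for i in ranked[:k]:
        # Each neighbor votes for its own label with its relevance weight.
        votes[train[i][1]] += weights[i]
    return max(votes, key=votes.get)


def reduce_training_set(train, weights, threshold):
    """Keep only instances whose weight exceeds a threshold.

    Assumption: a fixed threshold replaces the statistical test
    mentioned in the abstract, whose details are not given here.
    """
    kept = [(inst, w) for inst, w in zip(train, weights) if w > threshold]
    return [inst for inst, _ in kept], [w for _, w in kept]


# Toy data: two classes in the plane, with hand-picked weights.
train = [((0.0, 0.0), "a"), ((0.1, 0.0), "a"),
         ((1.0, 1.0), "b"), ((0.2, 0.1), "b")]
weights = [0.2, 0.3, 0.9, 1.0]
query = (0.1, 0.05)

uniform_label = weighted_knn_predict(train, [1.0] * len(train), query, k=3)
weighted_label = weighted_knn_predict(train, weights, query, k=3)
reduced_train, reduced_weights = reduce_training_set(train, weights, 0.25)
```

On this toy data the uniform vote and the weighted vote disagree, which is the kind of effect a weighting rule is meant to produce; whether it helps on real data depends entirely on how the weights are obtained.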
Similar resources
Totally Corrective Multiclass Boosting with Binary Weak Learners
In this work, we propose a new optimization framework for multiclass boosting learning. In the literature, AdaBoost.MO and AdaBoost.ECC are the two successful multiclass boosting algorithms, which can use binary weak learners. We explicitly derive these two algorithms’ Lagrange dual problems based on their regularized loss functions. We show that the Lagrange dual formulations enable us to desi...
Students' Performance Prediction System Using Multi Agent Data Mining Technique
Accurate prediction of students' performance helps identify low-performing students at the beginning of the learning process. Data mining is used to attain this objective. Data mining techniques are used to discover models or patterns in data, and they are helpful in decision-making. Boosting is one of the most popular techniques for constructing ensembles ...
Online multiclass boosting
Recent work has extended the theoretical analysis of boosting algorithms to multiclass problems and to online settings. However, the multiclass extension is in the batch setting and the online extensions only consider binary classification. We fill this gap in the literature by defining, and justifying, a weak learning condition for online multiclass boosting. This condition leads to an optimal...
A New AdaBoost Algorithm for Large Scale Classification And Its Application to Chinese Handwritten Character Recognition
Existing multiclass boosting algorithms struggle with Chinese handwritten character recognition because of the large number of classes. Most of them are based on schemes that convert multiclass classification into multiple binary classifications, and they have high training complexity. The proposed multiclass boosting algorithm adopts the descriptive model based multiclass classifiers (Modified Qu...